Speaker conversion in ARX-based source-
Abstract
A speaker conversion framework for formant synthesis is proposed. Given a small set of a target speaker's utterances, the framework converts the segmental features of the original speech to those of the target speaker. Unlike other speaker conversion frameworks, it also allows further voice quality modification of the converted speech using conventional formant modification techniques. The parameter conversion is based on MLLR in the cepstral domain. The effect of the parameter conversion can be seen in a graphical representation of formant placement. The results of an auditory experiment showed that most of the converted speech was perceived as similar to that of the target speakers.
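The abstract does not spell out how the MLLR-based conversion is computed. In general, MLLR adapts model mean vectors with an affine transform of the form A·μ + b. The following is only a minimal sketch of that idea, not the authors' method: it assumes a diagonal A, paired source and target cepstral vectors, and a per-dimension least-squares fit standing in for the maximum-likelihood estimate. The function names `fit_diagonal_affine` and `convert` are hypothetical.

```python
def fit_diagonal_affine(source, target):
    """Fit target[d] ~= scale[d] * source[d] + bias[d] for each cepstral dimension.

    Assumption: `source` and `target` are equal-length lists of paired
    cepstral vectors. Least squares is used here as a simple stand-in
    for the maximum-likelihood estimate of true MLLR.
    """
    n = len(source)
    dims = len(source[0])
    scales, biases = [], []
    for d in range(dims):
        xs = [vec[d] for vec in source]
        ys = [vec[d] for vec in target]
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        # Fall back to a pure bias shift if the source dimension is constant.
        scale = cov_xy / var_x if var_x > 0 else 1.0
        scales.append(scale)
        biases.append(mean_y - scale * mean_x)
    return scales, biases


def convert(vec, scales, biases):
    """Apply the fitted per-dimension affine transform to one cepstral vector."""
    return [a * x + b for x, a, b in zip(vec, scales, biases)]


if __name__ == "__main__":
    # Toy two-dimensional "cepstral" vectors, invented for illustration.
    src = [[0.0, 1.0], [1.0, 2.0], [2.0, 4.0]]
    tgt = [[1.0, -0.5], [3.0, 0.0], [5.0, 1.0]]
    scales, biases = fit_diagonal_affine(src, tgt)
    print(convert([3.0, 6.0], scales, biases))
```

A full-matrix A would also capture correlations between cepstral coefficients, at the cost of needing more adaptation data from the target speaker.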
Similar resources
Designing a New Non-Parallel Training Method for Voice Conversion That Outperforms Parallel Training
Introduction: Voice mimicking by computers has been one of the most challenging topics in speech processing in recent years. A voice conversion system has two sides. On one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...
Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines
This paper presents a voice conversion technique that uses speaker-dependent Restricted Boltzmann Machines (RBMs) to build high-order eigenspaces of the source and target speakers, in which it is easier to convert the source speech to the target speech than in the traditional cepstrum space. We build a deep conversion architecture that concatenates the two speaker-dependent RBMs with neural networks, expecting ...
Cross-language voice conversion based on eigenvoices
This paper presents a novel cross-language voice conversion (VC) method based on eigenvoice conversion (EVC). Cross-language VC is a technique for converting voice quality between two speakers who speak different languages. In general, parallel data consisting of utterance pairs from those two speakers are not available. To deal with this problem, we apply EVC to cross-language VC. First...
Sequence error (SE) minimization training of neural network for voice conversion
Neural network (NN) based voice conversion, which employs a nonlinear function to map features from a source speaker to a target speaker, has been shown to outperform the GMM-based voice conversion approach [4-7]. However, there are still limitations to overcome in NN-based voice conversion, e.g. the NN is trained on a Frame Error (FE) minimization criterion and the corresponding weights are adjusted to...
To Investigate the Accuracy of the Dynamic Time Warping Based Transformation Function for Voice Conversion
Voice conversion transforms the speaker characteristics of speech uttered by one speaker, called the source speaker, to generate speech with the voice characteristics of a desired speaker, called the target speaker. Voice conversion technology is used in many applications namely dubbing, to enhance the quality of the speech, text-to-speech synthesizers, online games, multimedia, music, c...
Publication date: 2003